All the group member participated in all the two assignments, and after discussion, formed this report.
#Q11
#Loading data and setup
nodes <- read.table("trainMeta.dat")
nodes <- cbind(rownames(nodes),nodes) #using the row index as ID
#name the bombing group as "group" so it fit the package requirement
colnames(nodes) <- c("id","label","group") # Change name to label so it can be shown
links <- read.table("trainData.dat")
#name the strength as "width" so it fit the package requirement
colnames(links) <- c("from", "to","width") #for lab requirement a
#lab requirement c
connection_counts <- links %>%
group_by(from) %>%
summarise(value = n())#need to rename to value, so it will show the size
nodes <- merge(nodes, connection_counts, by.x = "id", by.y = "from", all.x = TRUE) #left join to nodes
#Since there are some nodes have zero connection, we set it to one so it won't disappear
#Also, as the 0 connection will become 1, we add 1 to all other counts so the size of 0 connection and 1 connection will be different
nodes$value <- nodes$value +1
nodes$value[is.na(nodes$value)] <- 1
#requirement b and c are handled by default with correct column names
visNetwork(nodes,links, main="Terrorist Network")%>%
visPhysics(solver="repulsion")%>% #lab requirement d
visLayout(randomSeed = 12345)%>% #get consistent plot
visOptions(highlightNearest = list(enabled =TRUE, degree = 1)) #highlight nearest connected nodes
By choosing a node, we can see the first degree of connection. Here.
we choose the clusters below using the name as the cluster center. The
name can be seen when zoom in.
Cluster1:Galeb Kalaje
Cluster2:Jamal Zougam
Cluster3:Abderrahim Zbakh Cluster4:Semaan Gaby Eld
Note: This doesn’t mean the person has the most connection to others.
They are chosen so the overlap between clusters will be lower.
Also, we can see there are some nodes,6 in specific, that are not connected to others, these can be seen as outliers.
#Q12
visNetwork(nodes,links, main="Terrorist Network")%>%
visPhysics(solver="repulsion")%>%
visLayout(randomSeed = 12345)%>%
visOptions(highlightNearest = list(enabled =TRUE, degree = 2))
Here, we checked 3 people that appears to have the most connections,
Jamal Zougam, Mohamed Chaoui and
Imad Eddin Barakat.
Jamal Zougam: He owned a mobile phone shop in the Lavapiés and is the
one that sold the phone to detonate the bomb and prepared the
bomb.
Mohamed Chaoui: Has limited information online, believed to be the
half-brother of Jamal Zougam.
Imad Eddin Barakat: He is thought to have been a high-ranking member of
al-Qaeda’s Madrid component.
Since Imad Eddin Barakat is possibly in a leader position in Al-Qaeda.
We would say he has the best opportunity to spread the information, and
even plan the whole attack.
#Q13
#from course code template
nodes1<-nodes
net <- graph_from_data_frame(d=links, vertices=nodes, directed=F)
ceb <- cluster_edge_betweenness(net)
nodes1$group=ceb$membership
visNetwork(nodes1,links, main="Terrorist Network: Clusters")%>%
visPhysics(solver="repulsion")%>%
visLayout(randomSeed = 12345)%>%
visOptions(highlightNearest = list(enabled =TRUE, degree = 1))
By observing the plot, there are more clusters than the manually identified one we have in step 1. However, the pattern seems rather close, with the blue cluster close to the Cluster1, 2 combined, and green and purple close to Cluster3 and 4 from step 1 respectively.
#Q14
#adjacency representation from course website
net <- graph_from_data_frame(d=links, vertices=nodes, directed=F)
netm <- get.adjacency(net, attr="width", sparse=F)
colnames(netm) <- V(net)$label
rownames(netm) <- V(net)$label
rowdist<-dist(netm)
order1<-seriate(rowdist, "HC")
ord1<-get_order(order1)
reordmatr<-netm[ord1,ord1]
plot_ly(z=~reordmatr, x=~colnames(reordmatr),
y=~rownames(reordmatr), type="heatmap") %>%
layout(xaxis = list(title = ""),yaxis = list(title = ""))
The clusters on the heat map can be seen as the place that contains numbers of lighter color blocks, they can mostly be found on the diagonal line. The most pronounced one is on the upper right corner where we have people like, Jamal Zougam, Imad Eddin Barakat, Mohamed Belfatmi, Amer Azizi, Galeb Kalaje, Shakur and so on. This cluster is the blue cluster that we have already discovered in step 3.
#import the data
oilcoal<-read.csv2("Oilcoal.csv")
oilcoal<-oilcoal[,-6]
#use plotly to draw an animated bubble chart
bubble_chart<-plot_ly(oilcoal,
x=~Oil,
y=~Coal,
frame =~Year,
type = 'scatter',
mode = 'markers',
size = ~Marker.size,
color =~Country)%>%
animation_opts(200, easing = "cubic", redraw = F)
bubble_chart
From 2002, the consumption of coal increased quickly in China.
Between 1965 and 1972, the consumption of oil gradually increased in the US. However, from 1973 to 1975 the consumption of oil decreased and then increased. But from 1979 to 1983, the consumption of oil decreased again and then increased slowly.
There is a great relationship between these two kinds of consumption and population size. The larger the population, the greater the consumption.
data<-oilcoal[oilcoal$Country=="US"|oilcoal$Country=="Japan",]
#use plotly to draw an animated bubble chart
similar_plot<-plot_ly(data,
x=~Oil,
y=~Coal,
frame =~Year,
type = 'scatter',
mode = 'markers',
size = ~Marker.size,
color =~Country,
colors = c("red", "blue"))
similar_plot
From the plot, it seems Japan and the US have similar motion patterns. It maybe related to the oil crisis. The first oil crisis occured in 1973, it affected the consumption of oil in Japan and US. So from the plot above, we can see the consumption decreased from 1973 to 1975. And also the second oil crisis occured in 1979. The consumption of oil decreased again from 1979.
#calculate the proportion of fuel consumption
oilcoal$Oil_p<-oilcoal$Oil/(oilcoal$Oil+oilcoal$Coal)*100
#(a)create the new data frame
dup_row=oilcoal[1,]
new_oilcoal<-data.frame()
for(i in 1:nrow(oilcoal)){
dup_row<-rbind(oilcoal[i,],oilcoal[i,])
dup_row[2,6]<-0
new_oilcoal<-rbind(new_oilcoal,dup_row)
}
#(b)make the animated line plot of Oilp
Oilp_plot<-plot_ly(new_oilcoal,
x=~Country,
y=~Oil_p,
frame =~Year,
mode = 'markers',
line = list(width =20),
color =~Country)%>%
add_lines()
Oilp_plot
Advantage: Compared to the bubble chart, the variation of the fuel along y_axis between all the countries is very clear. We can focus on all the countries at the same time. In the bubble chart, the variations are not very clear except for the US and China with the relatively large values.
Disadvantage: In the bar chart, it only focuses on the proportion of the fuel consumption. It is hard to check one of the fuels (oil or coal) trend.
Oilp_plot2<-plot_ly(new_oilcoal,
x=~Country,
y=~Oil_p,
frame =~Year,
mode = 'markers',
line = list(width =20),
color =~Country)%>%
add_lines()%>%
animation_opts(frame=400, easing = "elastic", redraw = F)
Oilp_plot2
Advantage: In the previous plot without using easing, the changes between the two values in each bar move in a constant rate. As it is kind of a linear animation, it may appears to be unnatural. However, when using easing to get the animation chart. The changes between two values in each bar have bounce effect. As the speed of changes is different, it makes the animation more natural and can draw reader’s attention.
Disadvantage: When choosing the wrong easing settings, it may leads to misunderstanding when analyzing the plot as the bouncing will become distracting. This might also leads to the reader mistakenly associate the visual effect with the true value.
#observations are years(rownames) and variables are different countries(colname)
#create the dataset
year<-unique(oilcoal$Year)
country<-unique(oilcoal$Country)
dataset<-data.frame(matrix(ncol = 0, nrow = 0))
for(i in 1:length(year)){
for(j in 1:length(country)){
dataset[i,j]<-oilcoal[oilcoal$Country==country[j]&oilcoal$Year==year[i],3]
}
}
rownames(dataset)<-year
colnames(dataset)<-country
#refer to the lecture code
set.seed(12345)
dataset<-rescale(dataset)
tour<- new_tour(dataset, guided_tour(cmass), NULL)
steps <- c(0, rep(1/15, 200))
Projs<-lapply(steps, function(step_size){
step <- tour(step_size)
if(is.null(step)) {
.GlobalEnv$tour<- new_tour(dataset, guided_tour(cmass), NULL)
step <- tour(step_size)
}
step
}
)
## Value 0.665 21.8% better (0.781 away) - NEW BASIS
## Value 0.892 35.2% better (0.781 away) - NEW BASIS
## Value 0.950 7.1% better (0.513 away) - NEW BASIS
## Value 0.972 2.3% better (0.446 away) - NEW BASIS
## Value 0.975 0.3% better (0.111 away) - NEW BASIS
## Value 0.979 0.4% better (0.093 away) - NEW BASIS
## Value 0.982 0.5% better (0.132 away) - NEW BASIS
## Value 0.984 0.2% better (0.095 away) - NEW BASIS
## Value 0.985 0.1% better (0.059 away) - NEW BASIS
## Value 0.990 0.5% better (0.403 away) - NEW BASIS
## Value 0.990 0.0% better (0.028 away)
## Value 0.990 0.0% better (0.052 away)
## Value 0.991 0.1% better (0.099 away) - NEW BASIS
## Value 0.991 0.0% better (0.046 away)
## Value 0.992 0.1% better (0.080 away)
## Value 0.991 0.0% better (0.026 away)
## Value 0.991 0.0% better (0.033 away)
## Value 0.991 0.0% better (0.038 away)
## Value 0.991 0.0% better (0.029 away)
## Value 0.991 0.0% better (0.035 away)
## Value 0.991 0.0% better (0.041 away)
## Value 0.991 0.0% better (0.052 away)
## Value 0.991 0.0% better (0.050 away)
## Value 0.991 0.0% better (0.032 away)
## Value 0.991 0.0% better (0.037 away)
## Value 0.991 0.0% better (0.019 away)
## Value 0.992 0.1% better (0.071 away)
## Value 0.991 0.0% better (0.031 away)
## Value 0.991 0.0% better (0.023 away)
## Value 0.991 0.0% better (0.025 away)
## Value 0.991 0.0% better (0.021 away)
## Value 0.991 0.0% better (0.026 away)
## Value 0.991 0.0% better (0.031 away)
## Value 0.992 0.1% better (0.092 away)
## Value 0.991 0.0% better (0.034 away)
## Value 0.991 0.0% better (0.027 away)
## Value 0.992 0.1% better (0.084 away) - NEW BASIS
## Value 0.992 0.0% better (0.034 away)
## Value 0.992 0.0% better (0.029 away)
## Value 0.992 0.0% better (0.031 away)
## Value 0.992 0.0% better (0.023 away)
## Value 0.992 0.0% better (0.028 away)
## Value 0.992 0.1% better (0.066 away)
## Value 0.992 0.0% better (0.040 away)
## Value 0.992 0.0% better (0.019 away)
## Value 0.992 0.0% better (0.022 away)
## Value 0.992 0.0% better (0.044 away)
## Value 0.992 0.0% better (0.034 away)
## Value 0.992 0.1% better (0.055 away)
## Value 0.992 0.0% better (0.023 away)
## Value 0.992 0.0% better (0.031 away)
## Value 0.992 0.0% better (0.021 away)
## Value 0.992 0.0% better (0.022 away)
## Value 0.992 0.0% better (0.015 away)
## Value 0.992 0.0% better (0.014 away)
## Value 0.992 0.0% better (0.068 away)
## Value 0.992 0.0% better (0.057 away)
## Value 0.992 0.0% better (0.017 away)
## Value 0.992 0.0% better (0.016 away)
## Value 0.992 0.0% better (0.018 away)
## Value 0.992 0.0% better (0.049 away)
## No better bases found after 25 tries. Giving up.
## Final projection:
## 0.339 -0.315
## -0.142 0.542
## -0.541 0.107
## 0.037 0.111
## 0.474 -0.252
## -0.571 -0.571
## 0.117 0.397
## 0.084 -0.187
## Value 0.753 37.9% better (0.781 away) - NEW BASIS
## Value 0.900 20.5% better (0.569 away) - NEW BASIS
## Value 0.953 5.9% better (0.483 away) - NEW BASIS
## Value 0.956 0.4% better (0.113 away) - NEW BASIS
## Value 0.972 1.6% better (0.291 away) - NEW BASIS
## Value 0.975 0.4% better (0.075 away) - NEW BASIS
## Value 0.977 0.2% better (0.087 away) - NEW BASIS
## Value 0.984 0.7% better (0.207 away) - NEW BASIS
## Value 0.984 0.1% better (0.050 away)
## Value 0.984 0.0% better (0.029 away)
## Value 0.984 0.0% better (0.032 away)
## Value 0.984 0.0% better (0.019 away)
## Value 0.984 0.0% better (0.031 away)
## Value 0.984 0.0% better (0.030 away)
## Value 0.984 0.0% better (0.019 away)
## Value 0.984 0.0% better (0.039 away)
## Value 0.984 0.0% better (0.022 away)
## Value 0.984 0.0% better (0.031 away)
## Value 0.984 0.0% better (0.018 away)
## Value 0.984 0.0% better (0.050 away)
## Value 0.984 0.0% better (0.025 away)
## Value 0.984 0.0% better (0.021 away)
## Value 0.984 0.0% better (0.024 away)
## Value 0.984 0.0% better (0.037 away)
## Value 0.984 0.0% better (0.050 away)
## Value 0.984 0.0% better (0.021 away)
## Value 0.984 0.1% better (0.045 away)
## Value 0.984 0.0% better (0.033 away)
## Value 0.984 0.0% better (0.032 away)
## Value 0.984 0.0% better (0.046 away)
## Value 0.984 0.0% better (0.018 away)
## Value 0.984 0.0% better (0.020 away)
## No better bases found after 25 tries. Giving up.
## Final projection:
## 0.783 -0.043
## -0.152 0.657
## -0.114 0.354
## -0.062 -0.060
## 0.145 -0.320
## -0.193 -0.196
## -0.477 -0.096
## -0.247 -0.537
## Value 0.834 52.8% better (0.781 away) - NEW BASIS
## Value 0.968 17.4% better (0.756 away) - NEW BASIS
## Value 0.981 1.4% better (0.156 away) - NEW BASIS
## Value 0.983 0.2% better (0.117 away) - NEW BASIS
## Value 0.983 0.1% better (0.028 away)
## Value 0.983 0.0% better (0.027 away)
## Value 0.984 0.2% better (0.088 away) - NEW BASIS
## Value 0.984 0.0% better (0.014 away)
## Value 0.984 0.0% better (0.025 away)
## Value 0.985 0.1% better (0.086 away)
## Value 0.985 0.1% better (0.189 away) - NEW BASIS
## Value 0.985 0.0% better (0.054 away)
## Value 0.986 0.1% better (0.056 away)
## Value 0.986 0.0% better (0.028 away)
## Value 0.986 0.1% better (0.066 away)
## Value 0.986 0.0% better (0.036 away)
## Value 0.985 0.0% better (0.018 away)
## Value 0.986 0.0% better (0.029 away)
## Value 0.986 0.1% better (0.034 away)
## Value 0.986 0.0% better (0.027 away)
## Value 0.985 0.0% better (0.025 away)
## Value 0.986 0.0% better (0.033 away)
## Value 0.986 0.0% better (0.048 away)
## Value 0.986 0.0% better (0.022 away)
## Value 0.986 0.1% better (0.067 away)
## Value 0.986 0.0% better (0.029 away)
## Value 0.986 0.1% better (0.032 away)
## Value 0.985 0.0% better (0.019 away)
## Value 0.986 0.0% better (0.043 away)
## Value 0.985 0.0% better (0.020 away)
## Value 0.986 0.1% better (0.064 away)
## Value 0.985 0.0% better (0.017 away)
## Value 0.986 0.1% better (0.098 away) - NEW BASIS
## Value 0.986 0.0% better (0.020 away)
## Value 0.986 0.0% better (0.021 away)
## Value 0.986 0.0% better (0.018 away)
## Value 0.986 0.0% better (0.030 away)
## Value 0.986 0.0% better (0.033 away)
## Value 0.986 0.0% better (0.040 away)
## Value 0.987 0.0% better (0.042 away)
## Value 0.987 0.0% better (0.035 away)
## Value 0.987 0.0% better (0.034 away)
## Value 0.986 0.0% better (0.021 away)
## Value 0.986 0.0% better (0.076 away)
## Value 0.986 0.0% better (0.025 away)
## Value 0.987 0.0% better (0.044 away)
## Value 0.986 0.0% better (0.022 away)
## Value 0.987 0.0% better (0.024 away)
## Value 0.986 0.0% better (0.012 away)
## Value 0.986 0.0% better (0.011 away)
## Value 0.987 0.0% better (0.041 away)
## Value 0.987 0.0% better (0.039 away)
## Value 0.987 0.1% better (0.103 away)
## Value 0.987 0.1% better (0.039 away)
## Value 0.986 0.0% better (0.031 away)
## Value 0.987 0.0% better (0.055 away)
## Value 0.987 0.0% better (0.052 away)
## No better bases found after 25 tries. Giving up.
## Final projection:
## 0.815 -0.086
## -0.274 0.740
## -0.022 0.148
## 0.001 -0.157
## -0.043 0.002
## -0.300 -0.087
## -0.027 -0.174
## -0.409 -0.601
## Value 0.787 44.3% better (0.781 away) - NEW BASIS
## Value 0.925 18.8% better (0.726 away) - NEW BASIS
# projection of each observation
tour_dat <- function(i) {
step <- Projs[[i]]
proj <- center(dataset %*% step$proj)
data.frame(x = proj[,1], y = proj[,2], state = rownames(dataset))
}
# projection of each variable's axis
proj_dat <- function(i) {
step <- Projs[[i]]
data.frame(
x = step$proj[,1], y = step$proj[,2], variable = colnames(dataset)
)
}
stepz <- cumsum(steps)
# tidy version of tour data
tour_dats <- lapply(1:length(steps), tour_dat)
tour_datz <- Map(function(x, y) cbind(x, step = y), tour_dats, stepz)
tour_dat <- dplyr::bind_rows(tour_datz)
# tidy version of tour projection data
proj_dats <- lapply(1:length(steps), proj_dat)
proj_datz <- Map(function(x, y) cbind(x, step = y), proj_dats, stepz)
proj_dat <- dplyr::bind_rows(proj_datz)
ax <- list(
title = "", showticklabels = FALSE,
zeroline = FALSE, showgrid = FALSE,
range = c(-1.1, 1.1)
)
# for nicely formatted slider labels
options(digits = 3)
tour_dat <- highlight_key(tour_dat, ~state, group = "A")
tour <- proj_dat %>%
plot_ly(x = ~x, y = ~y, frame = ~step, color = I("black")) %>%
add_segments(xend = 0, yend = 0, color = I("gray80")) %>%
add_text(text = ~variable) %>%
add_markers(data = tour_dat, text = ~state, ids = ~state, hoverinfo = "text") %>%
layout(xaxis = ax, yaxis = ax)#%>%animation_opts(frame=0, transition=0, redraw = F)
tour
#make a time series plot for the Coal consumption of US country
US_coal<-data[data$Country=="US",3]
US_coal<-data.frame(Coal=US_coal)
plot_ly(US_coal, type = 'scatter', mode = 'lines')%>%
add_trace(y= ~Coal)%>%
layout(title = "time series plot for the Coal consumption of US",
xaxis = list(title = 'times'))
By observing the data point from different angles using the animation plot, it seems the projection around step:7 has the most compact and well-separated clusters. We can see two clusters and one outlier. One cluster is the year from 1965 to 1983, another cluster is the year from 1984 to 2008. The year of 2009 is an outlier, as it is far from other years. So we can say clusters correspond to different year ranges.
Us has the largest contribution to this projection. As the consumption of coal in the US is much larger than most countries and it does not change so quickly.
knitr::opts_chunk$set(echo = TRUE)
rm(list=ls())
library(plotly)
library(dplyr)
library(tourr)
library(ggraph)
library(igraph)
library(visNetwork)
library(seriation)
#Q11
#Loading data and setup
nodes <- read.table("trainMeta.dat")
nodes <- cbind(rownames(nodes),nodes) #using the row index as ID
#name the bombing group as "group" so it fit the package requirement
colnames(nodes) <- c("id","label","group") # Change name to label so it can be shown
links <- read.table("trainData.dat")
#name the strength as "width" so it fit the package requirement
colnames(links) <- c("from", "to","width") #for lab requirement a
#lab requirement c
connection_counts <- links %>%
group_by(from) %>%
summarise(value = n())#need to rename to value, so it will show the size
nodes <- merge(nodes, connection_counts, by.x = "id", by.y = "from", all.x = TRUE) #left join to nodes
#Since there are some nodes have zero connection, we set it to one so it won't disappear
#Also, as the 0 connection will become 1, we add 1 to all other counts so the size of 0 connection and 1 connection will be different
nodes$value <- nodes$value +1
nodes$value[is.na(nodes$value)] <- 1
#requirement b and c are handled by default with correct column names
visNetwork(nodes,links, main="Terrorist Network")%>%
visPhysics(solver="repulsion")%>% #lab requirement d
visLayout(randomSeed = 12345)%>% #get consistent plot
visOptions(highlightNearest = list(enabled =TRUE, degree = 1)) #highlight nearest connected nodes
#Q12
visNetwork(nodes,links, main="Terrorist Network")%>%
visPhysics(solver="repulsion")%>%
visLayout(randomSeed = 12345)%>%
visOptions(highlightNearest = list(enabled =TRUE, degree = 2))
#Q13
#from course code template
nodes1<-nodes
net <- graph_from_data_frame(d=links, vertices=nodes, directed=F)
ceb <- cluster_edge_betweenness(net)
nodes1$group=ceb$membership
visNetwork(nodes1,links, main="Terrorist Network: Clusters")%>%
visPhysics(solver="repulsion")%>%
visLayout(randomSeed = 12345)%>%
visOptions(highlightNearest = list(enabled =TRUE, degree = 1))
#Q14
#adjacency representation from course website
net <- graph_from_data_frame(d=links, vertices=nodes, directed=F)
netm <- get.adjacency(net, attr="width", sparse=F)
colnames(netm) <- V(net)$label
rownames(netm) <- V(net)$label
rowdist<-dist(netm)
order1<-seriate(rowdist, "HC")
ord1<-get_order(order1)
reordmatr<-netm[ord1,ord1]
plot_ly(z=~reordmatr, x=~colnames(reordmatr),
y=~rownames(reordmatr), type="heatmap") %>%
layout(xaxis = list(title = ""),yaxis = list(title = ""))
#import the data
oilcoal<-read.csv2("Oilcoal.csv")
oilcoal<-oilcoal[,-6]
#use plotly to draw an animated bubble chart
bubble_chart<-plot_ly(oilcoal,
x=~Oil,
y=~Coal,
frame =~Year,
type = 'scatter',
mode = 'markers',
size = ~Marker.size,
color =~Country)%>%
animation_opts(200, easing = "cubic", redraw = F)
bubble_chart
data<-oilcoal[oilcoal$Country=="US"|oilcoal$Country=="Japan",]
#use plotly to draw an animated bubble chart
similar_plot<-plot_ly(data,
x=~Oil,
y=~Coal,
frame =~Year,
type = 'scatter',
mode = 'markers',
size = ~Marker.size,
color =~Country,
colors = c("red", "blue"))
similar_plot
#calculate the proportion of fuel consumption
oilcoal$Oil_p<-oilcoal$Oil/(oilcoal$Oil+oilcoal$Coal)*100
#(a)create the new data frame
dup_row=oilcoal[1,]
new_oilcoal<-data.frame()
for(i in 1:nrow(oilcoal)){
dup_row<-rbind(oilcoal[i,],oilcoal[i,])
dup_row[2,6]<-0
new_oilcoal<-rbind(new_oilcoal,dup_row)
}
#(b)make the animated line plot of Oilp
Oilp_plot<-plot_ly(new_oilcoal,
x=~Country,
y=~Oil_p,
frame =~Year,
mode = 'markers',
line = list(width =20),
color =~Country)%>%
add_lines()
Oilp_plot
Oilp_plot2<-plot_ly(new_oilcoal,
x=~Country,
y=~Oil_p,
frame =~Year,
mode = 'markers',
line = list(width =20),
color =~Country)%>%
add_lines()%>%
animation_opts(frame=400, easing = "elastic", redraw = F)
Oilp_plot2
#observations are years(rownames) and variables are different countries(colname)
#create the dataset
year<-unique(oilcoal$Year)
country<-unique(oilcoal$Country)
dataset<-data.frame(matrix(ncol = 0, nrow = 0))
for(i in 1:length(year)){
for(j in 1:length(country)){
dataset[i,j]<-oilcoal[oilcoal$Country==country[j]&oilcoal$Year==year[i],3]
}
}
rownames(dataset)<-year
colnames(dataset)<-country
#refer to the lecture code
set.seed(12345)
dataset<-rescale(dataset)
tour<- new_tour(dataset, guided_tour(cmass), NULL)
steps <- c(0, rep(1/15, 200))
Projs<-lapply(steps, function(step_size){
step <- tour(step_size)
if(is.null(step)) {
.GlobalEnv$tour<- new_tour(dataset, guided_tour(cmass), NULL)
step <- tour(step_size)
}
step
}
)
# projection of each observation
tour_dat <- function(i) {
step <- Projs[[i]]
proj <- center(dataset %*% step$proj)
data.frame(x = proj[,1], y = proj[,2], state = rownames(dataset))
}
# projection of each variable's axis
proj_dat <- function(i) {
step <- Projs[[i]]
data.frame(
x = step$proj[,1], y = step$proj[,2], variable = colnames(dataset)
)
}
stepz <- cumsum(steps)
# tidy version of tour data
tour_dats <- lapply(1:length(steps), tour_dat)
tour_datz <- Map(function(x, y) cbind(x, step = y), tour_dats, stepz)
tour_dat <- dplyr::bind_rows(tour_datz)
# tidy version of tour projection data
proj_dats <- lapply(1:length(steps), proj_dat)
proj_datz <- Map(function(x, y) cbind(x, step = y), proj_dats, stepz)
proj_dat <- dplyr::bind_rows(proj_datz)
ax <- list(
title = "", showticklabels = FALSE,
zeroline = FALSE, showgrid = FALSE,
range = c(-1.1, 1.1)
)
# for nicely formatted slider labels
options(digits = 3)
tour_dat <- highlight_key(tour_dat, ~state, group = "A")
tour <- proj_dat %>%
plot_ly(x = ~x, y = ~y, frame = ~step, color = I("black")) %>%
add_segments(xend = 0, yend = 0, color = I("gray80")) %>%
add_text(text = ~variable) %>%
add_markers(data = tour_dat, text = ~state, ids = ~state, hoverinfo = "text") %>%
layout(xaxis = ax, yaxis = ax)#%>%animation_opts(frame=0, transition=0, redraw = F)
tour
#make a time series plot for the Coal consumption of US country
US_coal<-data[data$Country=="US",3]
US_coal<-data.frame(Coal=US_coal)
plot_ly(US_coal, type = 'scatter', mode = 'lines')%>%
add_trace(y= ~Coal)%>%
layout(title = "time series plot for the Coal consumption of US",
xaxis = list(title = 'times'))